Abstract
Introduction:
Genome editing technologies, such as clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9), are widely used to engineer universal allogeneic chimeric antigen receptor T (UCAR-T) cells by disrupting genes that cause immune rejection. While these approaches enable powerful therapeutic applications, they also pose risks of unintended genomic alterations, particularly structural variations (SVs), that may compromise product safety and efficacy. Primer-extension-mediated sequencing (PEM-seq) is widely used to assess risky SVs induced by genome editing. However, PEM-seq is limited by primer binding site dependency and amplification bias, leading to inaccurate assessments of genome editing efficiencies and incomplete detection of SVs. Here, we developed Target Enrichment Long-range Sequencing (TELS), a novel method that enables precise quantification of genome editing efficiency while significantly reducing the underdetection of SVs. We demonstrate that TELS provides a more robust evaluation of SVs in genome-edited UCAR-T cells.
Methods:
CD3+ T cells were isolated from the peripheral blood of a healthy donor and electroporated with CRISPR/Cas9 lipid nanoparticles to simultaneously knock out TRAC, CD7, and CIITA. Genome-edited cells were cultured for 11 days, after which PEM-seq and TELS were applied to detect SVs. Six PEM-seq assays were performed independently following previously published protocols, with two assays (+ and − strands) conducted for each of the three target loci. For TELS, genomic fragments spanning ±20 kb around each target site were enriched via hybrid capture, followed by long-read PacBio HiFi sequencing. An in-house bioinformatics pipeline was developed to identify small insertions/deletions (indels) and SVs. Non-edited control cells were assayed in parallel to characterize background noise for both methods.
Results:
For PEM-seq, 20 μg of genomic DNA (gDNA) input yielded a minimum of 138,013 unique events per assay (33 million PE150 reads), representing 2.24% of input genomes. Genome editing efficiencies (indel frequencies) for TRAC, CD7, and CIITA were 97.33% (+)/98.99% (−), 98.05% (+)/99.24% (−), and 43.40% (+)/26.55% (−), respectively. Across the six PEM-seq assays, 8,734 large deletions (>100 bp; 0.0237% of total genomes) and 6,835 translocations (0.0186% of total genomes) were detected. For TELS, 960 ng of gDNA input achieved an average coverage depth of 58,261× (6.5 million PacBio HiFi reads, with read lengths primarily between 3 and 7 kb), representing 19.77% of input genomes. Genome editing efficiencies (indel frequencies) determined by TELS were 96.21% (TRAC), 94.84% (CD7), and 53.08% (CIITA). Cumulatively, 13,707 large deletions (>100 bp; 4.651% of total genomes) and 2,530 translocations (0.858% of total genomes) were identified across all three loci.
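The "% of input genomes" figures above can be approximated by converting gDNA mass into haploid genome equivalents. The following is a minimal sketch, not the authors' calculation: the abstract does not state the conversion constant, so the value of roughly 3.26 pg per haploid human genome is an assumption, as is treating the TELS coverage depth as a count of unique molecules per locus. The function names are hypothetical.

```python
# Assumption: mass of one haploid human genome in picograms.
# The abstract does not state this constant; 3.26 pg is illustrative.
PG_PER_HAPLOID_GENOME = 3.26


def genome_equivalents(input_ng: float,
                       pg_per_genome: float = PG_PER_HAPLOID_GENOME) -> float:
    """Convert a gDNA mass in nanograms to haploid genome equivalents."""
    return input_ng * 1000.0 / pg_per_genome  # 1 ng = 1000 pg


def recovered_fraction(unique_molecules: float, input_ng: float,
                       pg_per_genome: float = PG_PER_HAPLOID_GENOME) -> float:
    """Fraction of input genome equivalents represented in the sequencing data."""
    return unique_molecules / genome_equivalents(input_ng, pg_per_genome)


# PEM-seq: 20 ug (20,000 ng) input, 138,013 unique events.
pem_seq_fraction = recovered_fraction(138_013, 20_000)
# TELS: 960 ng input, 58,261x average coverage depth
# (assumed here to count unique molecules per locus).
tels_fraction = recovered_fraction(58_261, 960)

print(f"PEM-seq: {pem_seq_fraction:.2%} of input genomes")
print(f"TELS:    {tels_fraction:.2%} of input genomes")
```

Under these assumptions the results land close to the reported 2.24% and 19.77%; small differences would be expected if a slightly different genome mass was used.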
Key Findings:
For the CIITA locus, genome editing efficiencies measured by the + and − PEM-seq assays differed by 16.85 percentage points (43.40% vs. 26.55%), indicating substantial biases introduced during linear amplification and polymerase chain reaction. The low enrichment efficiency (2.24%) further suggests that excessive initial DNA input may exacerbate these biases. Normalized to gDNA input, TELS detected 196.2-fold more large deletions (4.651% vs. 0.0237% of total genomes) and 46.1-fold more translocations (0.858% vs. 0.0186% of total genomes) than PEM-seq. Most SVs missed by PEM-seq shared a common feature: their breakpoints resided outside the PEM-seq primer binding regions, rendering them unamplifiable.
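The comparisons above follow directly from the frequencies reported in the Results. As a quick arithmetic check, the sketch below recomputes the fold differences and the CIITA strand discordance; the helper name `fold_change` is hypothetical, and only numbers stated in the abstract are used.

```python
def fold_change(tels_pct: float, pem_seq_pct: float) -> float:
    """Ratio of two event frequencies expressed in the same units (% of total genomes)."""
    return tels_pct / pem_seq_pct


# Large deletions (>100 bp): TELS 4.651% vs. PEM-seq 0.0237% of total genomes.
large_deletion_fold = fold_change(4.651, 0.0237)
# Translocations: TELS 0.858% vs. PEM-seq 0.0186% of total genomes.
translocation_fold = fold_change(0.858, 0.0186)
# CIITA editing efficiency: + strand 43.40% vs. - strand 26.55% (percentage points).
ciita_strand_gap = abs(43.40 - 26.55)

print(f"Large deletions: {large_deletion_fold:.1f}-fold")
print(f"Translocations:  {translocation_fold:.1f}-fold")
print(f"CIITA +/- gap:   {ciita_strand_gap:.2f} percentage points")
```

Because both methods report events as a share of total genomes, the ratio is independent of the absolute input mass, which differs between the assays (20 μg per PEM-seq assay vs. 960 ng for TELS).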